The PLATO Air Force Base Computer-Based Education (PLATO AFB CBE)
project at Chanute adopted the mastery learning technique in their 34
lessons and set the mastery criterion at 80% correct on the end-of-lesson
test. They used the performance result of each criterion-referenced test
(CRT) in two different ways: (1) for assessing individual performance,
and (2) for evaluation, or more specifically in Chanute's context, lesson
evaluation.

The adoption of a criterion-referenced testing approach to evaluation
raises two measurement issues that have relatively less importance in
norm-referenced testing. The issues are (1) definition of mastery, and
(2) a priori standards. These issues still remain unresolved and are
receiving increasing attention. A large number of articles relating to
this subject have been published, but the many definitions of mastery are
by no means equivalent. The concerns of these articles are limited to the
use of criterion-referenced testing for individual assessment, i.e.,
judging whether or not a given student has mastered a given instruction
to be learned to some suitable level of mastery (Block, 1971; Emrick,
1971; Millman, 1973; Besel, 1971; Novick & Lewis, 1974; Roudabush, 1974;
Huynh, 1976; Linn, 1977).

One purpose of this paper is to examine the appropriateness of the use
of CRTs as a means of controlling an individual student's advancement to
the next level of instruction or retainment in the current unit of
instruction in the PLATO AFB CBE Program (or project) at Chanute.

Our other purpose in this paper is to turn the focus from the aspect
of individual assessment to that of program evaluation, which requires the
establishment of a criterion rate for validation of a lesson, so that a
lesson would be considered validated if the failure rate at the end of the
lesson was less than the criterion.

Although there is a mathematical duality in both aspects of
criterion-referenced testing, it is felt that the program evaluation
aspect has not received all the attention that it deserves. One reason for
this is that the results of evaluation may call for expensive revisions in
instructional materials, at least in traditional teaching settings.
However, PLATO provides an ideal situation for program evaluation because
revision of lessons can be done with relatively little trouble and
expense. Therefore, it is important and necessary to explore reliable
methods that will help to improve the quality of CAI lessons.
Mastery learning strategies have been used in many educational
settings since Bloom (1968) advocated them in the late 1960's. In this new
approach to instruction, a mastery level is set for the material to be
learned so that a majority of the students must attain the criterion
level. Interesting findings about mastery learning strategies were
reported by Carroll (1963), Atkinson (1968), Block (1970), Kim, Hogan, et
al. (1970, 1971) and many others. According to Block (1971), mastery
learning allowed 75-90% of students to achieve the same level as the top
25% of students usually achieved with typical grouped instructional
methods such as in regular classrooms.

A similar study by Kim et al. (1970, 1971) showed that 72% of
approximately 5800 students in foreign language classes achieved a mastery
criterion of 80% correct on final tests under the mastery condition while
only 28% of the traditional condition achieved this level. The high
percentage of students achieving criterion in the mastery condition shows
the effectiveness of this strategy of instruction. However, these results
may also be due partly to the quality of lessons given to the students
during the experiment, or may even be due to the kinds of tests that were
given to the students in order to examine the degree of mastery achieved in
the instructional unit to be learned. We may be able to say that high
quality lessons produce a higher percentage of success than do low quality
lessons if the tests given at the end of the lessons are comparable to one
another.



The experienced instructional designer might say that the quality of
instruction may be determined by the appropriateness of instructional cues
and the quality and types of reinforcement given each student, as well as
the amount of participation and practice experienced by each student.
Therefore, determining the quality of instruction is a multidimensional and
complicated task. It is very difficult to measure these factors and develop
a method of setting validation criteria for CAI lessons based on the
quantitative data from such complex variables. Since our concern is to
restrict the discussion to the quantitative method of setting the
validation criterion of a given lesson, we will start by examining the
validation criterion that has been used in the Army and the PLATO AFB CBE
Program at Chanute Air Force Base.
2.2 Validation Criterion of Lessons in PLATO AFB CBE Program

The PLATO IV computer-based education system, in development for over
a decade at the University of Illinois, was used in the training program of
Special and General Purpose Vehicle Repairmen at Chanute Air Force Base
(Dallman, 1977). The 37 CAI lessons in the program, comprising almost 30
hours of instruction and 37 tests, are implemented on the PLATO system
along with a routing program that provides [illegible] management. The 37
lessons are homogeneous in subject matter and tutorial in style for the
most part. They are arranged in mastery learning fashion, so that students
must achieve the mastery level of the test which was given at the end of
each lesson in order to be advanced to the next lesson. If the mastery
level is not attained, the student must repeat the lesson. The tests
consist mostly of matching and multiple-choice items. Mastery levels were
aimed at the 80% level, but the actually used cutoff scores ranged between
75% and 90% of the items answered correctly. Test lengths varied from 5 to
20 items, and the scores on the first try of each item are summed to yield
the total score of each test. The tests are called MVE, for Master
Validation Exams. For example, the test at the end of lesson 101 is called
MVE101. The description of their lessons is given in Appendix 2.

Table 1

Summary of Master Validation Exams in the Chanute PLATO AFB CBE Project

                   Validation   Size of tested-   % of      % of      Total   # of
Lessons    N*      Date         out sample        Success   Failure   N       Success

103        30      10 June      63                89%       11%       93      83
104a       30      14 April     114               94%       6%        144     134
104b       30      14 April     113               86%       14%       143     124
105        30      14 April     102               88%       12%       132     117
106        30      19 June      33                82%       18%       63      54
201a       30      28 May       99                90%       10%       129     116
201b       30      23 May       109               72%       28%       139     105
202a       30      18 Aug       33                82%       18%       63      54
202b       30      28 May       90                98%       2%        120     115
203a       30      28 May       33                97%       3%        63      59
203b       30      13 June      33                94%       6%        63      58
203c       30      18 Aug       33                91%       9%        63      57
204        30      18 Aug       33                94%       6%        63      58
205a       30      15 Jan       33                79%       21%       63      53
205b       30      15 Jan       33                82%       18%       63      54
206a       30      13 June      90                82%       18%       120     101
206b       30      25 June      65                82%       18%       95      80
206c       30      11 April     118               95%       5%        148     139
207        30      15 Aug       33                91%       9%        63      57
301        30      25 June      109               79%       21%       139     113
304        30      25 June      65                82%       18%       95      80
305        30      18 May       109               96%       4%        139     132
307        30      14 April     130               81%       19%       160     132
308        30      18 May       109               63%       37%       139     96
401        30      [?]7 April   142               83%       17%       172     146
402        30      8 July       65                79%       21%       95      78
403        30      30 June      65                79%       21%       95      78
404        30      2 Sept       33                100%      0%        63      60
405a       30      26 Aug       33                100%      0%        63      60
405b       30      26 Aug       33                91%       9%        63      57
405c       30      26 Aug       33                94%       6%        63      58
405d       30      2 Sept       33                73%       27%       63      51
406        30      30 June      65                95%       5%        95      89
407        30      2 Sept       33                88%       12%       63      56

* N is the sample size used for establishing validation dates.

A lesson is said to be validated when 90% of the students have
achieved the given criterion level of 75%-90% of the items answered
correctly in the first attempt on each master validation exam. The sample
consisted of about 30 students from successive classes. No major
modifications of lessons were made until all students in the sample
finished the lessons. All lessons were validated according to this
criterion between April and September of 1975. The exact validation dates
of the lessons are shown in Table 1. In order to validate the validation
criterion, the lessons that were said to be validated were left unchanged
during the evaluation period and were tested on more students who came in
after the validation dates were established.

It is interesting to note that 15 out of 34* lessons achieved the
criterion level of 90% success rate at the end of the evaluation period,
although all lessons are labeled "validated." Indeed, this result can be
expected and is not very surprising. The next sections will be devoted to
explaining the reason.

* Lessons available for the analysis were reduced to 34 from 37.



2.3 Bayesian-Binomial Model



By applying a simple binomial model to the first 30 subjects with
whom the validation dates were established, we obtain the result that
the probability of failure to meet the validation criterion upon follow-up
testing is 36.3%. Therefore, 12 out of 34 lessons are predicted to
be failures. Similarly, the posterior distribution of the Bayesian binomial
model, where a beta function was taken as a prior distribution, predicts
59.1% failure to meet the validation criterion (this calculation was
done by the PLATO version of CADA developed by Mel Novick). In other
words, 20 out of 34 lessons are predicted to miss the validation
criterion. Table 1 shows that 19 lessons have a failure rate greater
than 10%, which is very close to the number (20) predicted by the
Bayesian binomial model. This fact indicates the need to introduce a more
accurate validation criterion for lessons. The reader might wonder how
the prior distribution was chosen here. It was based on the belief of the
people who participated in the PLATO AFB CBE project.
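
The two predicted counts follow from multiplying each failure probability
by the 34 lessons. The following is only a minimal sketch of that
arithmetic: the function name `predicted_failures` is hypothetical, and the
two probabilities are taken from the text as given rather than recomputed.

```python
def predicted_failures(failure_probability, n_lessons=34):
    """Expected number of lessons missing the criterion, to the nearest lesson."""
    return round(failure_probability * n_lessons)


# Probabilities quoted in the text (not recomputed here).
simple_binomial = predicted_failures(0.363)    # 0.363 * 34 = 12.342
bayesian_binomial = predicted_failures(0.591)  # 0.591 * 34 = 20.094

print(simple_binomial, bayesian_binomial)      # prints: 12 20
```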

Producing a lesson to be used on the PLATO system is not a simple
task. Many steps are involved in the completion of a lesson, including
tryout with students and gathering empirical evidence which might indicate
further revision or modification of the lessons. No unique method for
lesson revision, based on the theories of educational psychology and
educational measurement, has been developed for use on the PLATO
system. As signals pointing to the need for revision, some authors choose
to look at "Area Data," which is collected by the computer and consists of
elapsed time in the area (a segment of instruction), number of
questions answered correctly on the first try (okf's), number of
incorrect responses to questions (no's), number of correct responses to
questions (ok's), and number of helps requested. Others design and
implement their own data collection routines. These data usually give
lesson authors a very rough idea of how well their lessons work with
students and indicate the areas in which students have trouble going
through.

Thus, it is possible for a PLATO lesson author to have some degree of
confidence in the quality of his lessons by the time the lesson becomes a
nearly finished product. The degree of his confidence might depend on his
knowledge of teaching strategies or his past experience. If he uses
teaching strategies such as mastery learning, which has been examined by
many researchers and is known to be highly effective, then it is natural to
assume that he would be highly confident of the quality of his lesson. If
an author has substantial experience producing lessons on the PLATO system,
and has used them successfully in his class, then his experience will
assure him of the success of his new lesson.



It must be true that lessons in which the author has high
confidence are more likely to produce a higher success rate in a future
use of his lessons. Suppose p is the true probability of success
associated with a given lesson; in other words, a proportion p of students
achieves the mastery level in a population. In general, a Bayesian density
indicates a state of belief about a parameter, such as p here,
intermediate between the estimate "I know nothing about p" and "I know
the exact value of p."

Two types of densities are used, one being the prior density,
representing beliefs about the parameter before observations are obtained,
and the other being the posterior density, representing beliefs after
seeing the data. In our situation, the task is to infer the value of p
from an observation. It is clear that p obtained in this way cannot be
exact: that 20 students passed the test out of 25 students is quite a
probable number for lessons with the value of p anywhere between .65 and
.90. But the observation that 80% of students achieved the mastery level
makes p around .8 more likely for the lesson than p around .3, so we should
estimate p as .8 if nothing else is known about the lessons. If the author
has some information about the lesson, such as that, since the lesson is
dealing with a simple introductory task, the value of .8 is somewhat lower
than it should be, then we would be more inclined to think that the true
probability of success associated with the lesson is higher than .8. If the
author has substantial experience in producing high quality lessons in past
years, then his new lesson would be more likely to be considered to have a
higher true probability of success than .8, even though the observed
success rate is .8 in the sample. Therefore, our estimate of the true
probability p depends not only on the observed value x, but also on what we
know about p before observing x.

The previous knowledge can be expressed by a prior density function
f(p) (also called a prior probability density function). The product
of f(p) and the likelihood function f(x|p) (i.e., the conditional
probability of x given p) gives a quantity proportional to the posterior
density function f(p|x):

$$ f(p \mid x) \propto f(x \mid p)\, f(p) $$

where f(x|p) is called the model density function instead of likelihood, as
in Bayesian statistics.

The model density is used for inference in traditional statistics, or
sampling theory. It is clear that Bayesian statistics uses more
information than traditional statistics does, i.e., the prior density
function; consequently, Bayesian statistics will provide us with more
accurate information, at least mathematically, than traditional statistics
will if the choice of our prior density is the right one. Indeed, it is
possible to demonstrate such an example, especially when the number of
observations is fairly small. But it is true that the model density, the
conditional probability of x given p, will have most influence on the
posterior density when the number of observations is larger.

A detailed discussion of the Bayesian binomial model can be found
elsewhere (Novick and Jackson, 1975; Ferguson, T., 1971). We will show
only the Bayesian densities in this paper. If we assume the prior belief
of p follows a beta distribution, then the prior density f(p) is given
by a beta function:

$$ f(p) = \frac{p^{a-1}(1-p)^{b-1}}{B(a,b)}, \qquad 0 \le p \le 1,\; a > 0,\; b > 0, $$

the model density f(x|p), regarded as a function of p, is

$$ f(x \mid p) = \frac{p^{x}(1-p)^{N-x}}{B(x+1,\, N-x+1)}, $$

and the posterior density f(p|x) is given by

$$ f(p \mid x) = \frac{p^{a+x-1}(1-p)^{b+N-x-1}}{B(a+x,\, b+N-x)}, $$

where

$$ B(a,b) = \frac{\Gamma(a)\,\Gamma(b)}{\Gamma(a+b)} $$

and N is the number of subjects.

Application of the Bayesian binomial model to the 34 Chanute lessons will
be demonstrated in the next section.
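
Because the beta prior is conjugate to the binomial model, the posterior
parameters are obtained by adding the observed successes to a and the
observed failures to b. The sketch below illustrates that update only; the
function name `beta_binomial_update` is hypothetical, and the prior
Beta(26.2, 3.8) is an assumption inferred here from the posterior
B(53.2, 6.8) reported later in this section, not a value stated in the
text.

```python
def beta_binomial_update(a, b, successes, n_subjects):
    """Return the posterior beta parameters (a + x, b + N - x)."""
    if not 0 <= successes <= n_subjects:
        raise ValueError("successes must lie between 0 and n_subjects")
    return a + successes, b + (n_subjects - successes)


# Assumed prior Beta(26.2, 3.8); data: 27 of 30 students reach mastery.
post_a, post_b = beta_binomial_update(26.2, 3.8, successes=27, n_subjects=30)
print(round(post_a, 1), round(post_b, 1))  # prints: 53.2 6.8
```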



The rule for establishing validation of a lesson was that 27 of 30
students entering the lesson successively must pass the mastery test
given at the end of the lesson; if this criterion was not met, some
revision of the lesson was carried out. If we consider the 34 lessons to
be homogeneous, as Dallman (1977) stated in his paper, the model
density function derived from a sample of size 30 with 27 successful
attempts predicts a 63.7% chance of success for each lesson in the future
at the time when the validation date was established.

The corresponding prior density in our situation is obtained from
the validation criterion which has been used in CBE programs in the
Army (Branson et al., 1975): 27 of 30 achieving criterion. It was
believed that this rule was adequate to determine the cutoff point for
terminating the process of lesson modification and beginning to gather
data for evaluating the PLATO AFB CBE project at Chanute. The belief
that a 90% rate of success in thirty successive subjects is an adequate
criterion for validating lessons can be thought of as the prior
condition. Therefore, the same beta-binomial distribution function as
the model density function is taken as a prior density distribution in
this case.

Applying Bayes' theorem to the prior and model densities, the posterior
density function is given by the beta-binomial function B(53.2, 6.8) with a
mean of .887 and standard deviation of .04. The 50% credibility interval is
given by [.8714, .9244], in which mode .9 and mean .887 are included.
In Bayesian statistics, the interval [.8714, .9244] is called a 50%
credibility interval for the ability (or success rate) because the 50% is
the measure of the strength of our belief, taking into account our prior
knowledge and our observation, that the student's (or lesson's) ability
lies in that interval. In particular, [.87, .92] is a 50% interval between
the 25th and 75th percentiles and is called the highest-density region in
the belief, a 50% HDR. The length of the interval, .92 - .87, is called an
interquartile range and is used as a measure of variability of the
distribution.
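
The summary figures for B(53.2, 6.8) can be checked with the closed-form
mean, mode and standard deviation of a beta density, and the probability
that the success rate is at least .9 can be approximated by numerical
integration. This is a sketch under stated assumptions, not the CADA
computation: the function names are hypothetical, Simpson's rule stands in
for whatever method CADA used, and the tail routine assumes a > 1 and
b > 1. The tail probability it returns should land near the .409 quoted
elsewhere in this chapter for a sample of size 30.

```python
import math


def beta_summary(a, b):
    """Closed-form mean, mode (valid for a > 1, b > 1) and standard deviation."""
    mean = a / (a + b)
    mode = (a - 1) / (a + b - 2)
    sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, mode, sd


def beta_pdf(p, a, b):
    """Beta(a, b) density at p, evaluated in log space (assumes a > 1, b > 1)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))


def beta_tail(a, b, cutoff, steps=20000):
    """Approximate P(pi >= cutoff) with composite Simpson's rule (steps even)."""
    h = (1.0 - cutoff) / steps
    total = beta_pdf(cutoff, a, b) + beta_pdf(1.0, a, b)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * beta_pdf(cutoff + i * h, a, b)
    return total * h / 3.0


mean, mode, sd = beta_summary(53.2, 6.8)
tail = beta_tail(53.2, 6.8, cutoff=0.9)
print(round(mean, 3), round(mode, 3), round(sd, 3))
print(round(tail, 2))
```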

As seen in Table 1, we have further observations made after the
validation dates were established. Let us extend our discussion further.
Table 2 summarizes the results of the Bayesian beta-binomial analysis
for each lesson based on the expanded sample and newly observed success
rate. The model density functions of the lessons given in Table 2 were
derived from the new sample of size given in column 8 and number of
successes in column 9 of Table 2. The parameters of the prior density, the
50% HDR, and the probabilities of π larger than or equal to .9
(Prob(π ≥ .9)) are given in Table 2. From the last column of Table 2 we
may select the lessons whose probabilities of being validated lessons are
greater than .50. Since all standard deviations and interquartile ranges
are small, i.e., mostly less than .05, the probability that π is greater
than or equal to .85 will be drastically greater. For example, lesson 105
has Prob(π ≥ .85) = .86 while Prob(π ≥ .9) = .25. Therefore, it is
recommended that the validation criterion of 90% be replaced by a slightly
higher value, 92% or so. If we defined the validation criterion by a
slightly higher success rate, say, 28 out of 30 students achieving the
mastery level in a successive sample, then the validation dates given in
column 4 of Table 2 would be later dates, but the estimation of the true
probability of success would be much improved.

Table 2

Credibility Intervals of Master Validation Exams by Bayesian Binomial Model

           Observed
Lessons    Score              Mode   Mean   S.D.   a       b      50% CI          P(π ≥ .90)

103        83/93   = .892     .89    .89    .03    109.2   13.8   .8744, .9120    .36
104a       134/144 = .931     .93    .93    .02    133.2   10.8   .9157, .9444    .87
104b       124/143 = .867     .87    .86    .03    123.2   19.8   .8467, .8851    .08
105        117/132 = .886     .89    .88    .03    116.2   15.8   .8665, .9040    .25
106        54/63   = .857     .86    .84    .05    53.2    9.8    .8238, .8842    .10
201a       116/129 = .899     .90    .89    .03    115.2   13.8   .8800, .9160    .43
201b       105/139 = .755     .75    .75    .04    104.2   34.8   .7280, .7774    .00
202a       54/63   = .857     .86    .84    .05    53.2    9.8    .8238, .8842    .10
202b       115/120 = .958     .95    .94    .02    141.2   8.8    .9340, .9588    .97
203a       59/63   = .937     .93    .92    .03    85.2    7.8    .9052, .9425    .74
203b       58/63   = .921     .92    .91    .03    57.2    5.8    .8959, .9425    .63
203c       57/63   = .905     .90    .89    .04    83.2    9.8    .8811, .9228    .47
204        58/63   = .921     .92    .91    .03    57.2    5.8    .8959, .9425    .63
205a       53/63   = .841     .86    .85    .04    79.2    13.8   .8337, .8826    .08
205b       54/63   = .857     .86    .84    .05    53.2    9.8    .8238, .8842    .10
206a       101/120 = .842     .85    .85    .03    127.2   22.8   .8324, .8716    [illegible]
206b       80/95   = .842     .86    .85    .03    106.2   18.8   .8331, .8758    .05
206c       139/148 = .939     .94    .93    .02    138.2   9.8    .9255, .9521    .94
207        57/63   = .905     .90    .89    .03    83.2    9.8    .8811, .9228    .47
301        113/139 = .813     .83    .82    .03    139.2   29.8   .8073, .8466    .00
304        80/95   = .842     .86    .85    .03    106.2   18.8   .8331, .8758    .04
305        132/139 = .950     .94    .94    .02    158.2   10.8   .9282, .9528    .96
307        132/160 = .826     .84    .83    .03    158.2   31.8   .8175, .8538    .00
308        96/139  = .691     .73    .72    .03    122.2   46.8   .7020, .7485    .00
401        146/172 = .849     .86    .85    .03    172.2   29.8   .8380, .872[?]  .00
402        78/95   = .821     .84    .83    .03    104.2   20.8   .8160, .8604    .013
403        78/95   = .821     .84    .83    .03    104.2   20.8   .8160, .8604    .013
404        60/63   = .952     .94    .93    .03    86.2    6.8    .9174, .9522    .84
405a       60/63   = .952     .94    .93    .03    86.2    6.8    .9174, .9522    .84
405b       57/63   = .905     .90    .89    .03    83.2    9.8    .8811, .9228    .47
405c       58/63   = .921     .92    .91    .04    57.2    5.8    .8959, .9425    .63
405d       51/63   = .810     .84    .83    .04    77.2    15.8   .8103, .8622    .02
406        89/95   = .937     .93    .92    .02    115.2   9.8    .9117, .9451    .82
407        56/63   = .889     .89    .88    .04    55.2    7.8    .8595, .9137    .31



Lesson 201a has a 90% success rate in an observation of 99 students who
entered the lesson after the validation date, May 28th. This observed
success rate is the same as the validation criterion. It is interesting
to note that the 50% HDR [.88, .916] of the new prior density based on
the sample size of 129 is slightly narrower than that of size 30, [.8714,
.9244]. In general, when the number of students increases, the 50% HDR
gets narrower. Also, you will notice that the value in the last column
of Table 2 for lesson 201a is .43, which is larger than Prob(π ≥ .9) =
.409 when the sample size is 30. Therefore, our credibility of saying
that lesson 201a will have a success rate of 90% in the population from
which this sample was drawn will increase if the sample size on which
the model density was based increases.
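
The narrowing of the 50% HDR with sample size can be illustrated through
the standard deviation of the two beta densities involved. This is only a
sketch: the helper name `beta_sd` is hypothetical, the parameters
(53.2, 6.8) and (115.2, 13.8) are the values reported for the size-30
sample and for lesson 201a in Table 2, and the standard deviation is used
here as a stand-in for the width of the HDR.

```python
import math


def beta_sd(a, b):
    """Standard deviation of a Beta(a, b) density."""
    return math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))


sd_size_30 = beta_sd(53.2, 6.8)       # density based on 27 of 30
sd_lesson_201a = beta_sd(115.2, 13.8) # density reported for lesson 201a

print(round(sd_size_30, 3), round(sd_lesson_201a, 3))  # prints: 0.041 0.027
```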

Hence, setting the most appropriate validation criterion for a lesson
depends on two factors: success rate and sample size. The discussion of
these two factors will be carried out in a mathematically parallel, in
other words mathematically dual, way, taking the sample size as the number
of items or the test length, and the success rate as the proportion of
getting a correct answer for an item. In the next chapter, we will switch
the focus from the former, which is oriented toward the success rate of a
lesson, to the latter, which is for the success rate of an individual in a
test.
CRITERION-REFERENCED TEST AS ASSESSMENT OF STUDENT'S PERFORMANCE



3.1 Problems in Criterion-Referenced Tests


Criterion-referenced testing has gained much attention from
educational measurement and testing specialists in recent years. The
object of criterion-referenced testing is not to distinguish finely
among subjects, but to classify subjects into mastery and non-mastery
groups. Robert Glaser (1963) stated that the measures of CRTs depend on
an absolute standard of quality while those of NRTs depend on a relative
standard. CRTs are often used in conjunction with instructional
programs that maximize the number of students attaining a given mastery
level and minimize the variability of test scores, while norm-referenced
tests (NRTs) are used in selection or screening of a subgroup of examinees,
predicting students' future performances, and evaluation of
instructional programs.



The concepts of criterion-referenced testing are quite different
from those of norm-referenced testing. Strictly speaking, the test
scores of an NRT are assumed to be distributed normally while those of a
CRT are highly skewed. The variability in scores of an NRT is large
while that of a CRT is small. These differences are generally
expected but need not be observed in practice. Statistical measures in
the classical test theory model, such as reliability and validity, are
defined on the basis of assuming that the standard deviation of any NRT
is always positive and adequately large. Therefore, the definition of
reliability as the ratio of true score variance to observed score
variance can be a meaningful index there. The reliability will
increase as the test length (number of items) increases and hence the
variability of test scores increases. The test length of a CRT is
usually short, say 10 or 15 items, and often most items of a test are
answered correctly when all students take the test. Therefore the
reliability of a CRT cannot be satisfactorily large. As far as the
author knows, many CRTs have a KR-21 reliability of only about 0.5 or
[illegible]. Since it is a condition of criterion-referenced testing that
students are expected to achieve the level of mastery, say 90% correct, the
observed scores become [illegible]. If there are subjects with true
scores near the ceiling or the floor, it becomes implausible to assume
that the errors are distributed independently of true scores
for those near the boundaries. NRTs don't usually have ceiling or floor
effects. Their scores are distributed around the mean and are
seldom near either extreme. In such a test, it is reasonable to assume
that error scores are due to something independent of the subject's true
abilities, such as fatigue, anxiety, etc.

Lord and Novick (1968) discuss the plausible distributional forms of observed CRT scores and true scores in Chapter 23 of their book, "Statistical Theories of Mental Test Scores." We will follow their steps and adopt the binomial error model for CRT scores. The binomial error model assumes that, if each MVE test is aimed at measuring the learning level of a topic taught in the Vehicle Training Course, then all items in the test must measure the same task. In other words, all items in a test have one and only one common factor, with 0-1 scoring. Suppose there is a pool of items measuring the same task, and taking an item out of the pool is an independent event; that is, answering the earlier items on the test does not affect the ability of a student to answer later items correctly. Then we can formulate the distribution of raw scores x by a binomial distribution with parameter θ, in which θ is the proportion of items that a student would answer correctly over the entire pool of items. If T is a fixed true score and e is an error of measurement, then the raw score x can be expressed by the sum of the two, x = T + e, and θ is given by

$$ \theta = T/n $$

where n is the number of items in the test. Let h(x|θ) be the binomial distribution of x at any given true ability level θ; then the conditional distribution h(x|θ) can be given by

$$ h(x \mid \theta) = \binom{n}{x}\, \theta^{x} (1-\theta)^{n-x}, \qquad x = 0, 1, \ldots, n, $$

where n is the number of items in the test.
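As a quick illustration of the binomial error model, the sketch below evaluates h(x|θ) for a short test. The function name `binomial_pmf` and the example values (n = 10, θ = 0.8) are illustrative assumptions and do not come from the Chanute data.

```python
from math import comb

def binomial_pmf(x: int, n: int, theta: float) -> float:
    """h(x | theta): probability of x correct answers on n items at true ability theta."""
    return comb(n, x) * theta**x * (1.0 - theta) ** (n - x)

# Hypothetical example: a 10-item test and an examinee whose true ability is 0.8.
n, theta = 10, 0.8
pmf = [binomial_pmf(x, n, theta) for x in range(n + 1)]

# Probability that this examinee answers at least 8 of the 10 items correctly.
p_at_least_8 = sum(pmf[8:])
print(round(p_at_least_8, 4))
```

Even an examinee sitting exactly at an 80% true ability level falls below a score of 8 about a third of the time under this model, which is why short tests invite misclassification.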

^ " ■ . . - ' ■ - " ' . ' 

It is interesting to note that this model does not pay attention to item differences. The traditional measurement indices, such as item difficulty or item discrimination indices, are not the major concern in the binomial error model. Rather, finding out how accurately a test can estimate an examinee's pass or fail status with respect to a given mastery level is the main concern of the model.

Keats and Lord (1962) investigated the relationship between the distributions of observed test scores and true scores. The test scores could be adequately represented by the hypergeometric distribution h(x) with a negative parameter, and the true score distribution could be represented by the two-parameter beta distribution g(θ),

$$ g(\theta) = \frac{\theta^{a-1}(1-\theta)^{b-n}}{B(a,\; b-n+1)}, $$

where a > 0 and b > n - 1. And also

$$ h(x) = \int_0^1 \frac{\theta^{a-1}(1-\theta)^{b-n}}{B(a,\; b-n+1)} \binom{n}{x}\, \theta^{x}(1-\theta)^{n-x}\, d\theta, \qquad x = 0, 1, \ldots, n. $$

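Regressing the true score T on the observed score x, the linear regression is given by

$$ \hat{T} = \rho\, x + (1-\rho)\, \mu_x, $$

where μ_x is the mean of test scores and ρ is the reliability of the test. In the binomial error model, this is given by a similar equation,

$$ \hat{\theta} = \alpha_{21}\, \frac{x}{n} + (1-\alpha_{21})\, \frac{\mu_x}{n}, $$

where α21 is the ratio of true to observed score variance and is given by

$$ \alpha_{21} = \frac{n}{n-1}\left[\, 1 - \frac{\mu_x\,(n-\mu_x)}{n\,\sigma_x^{2}} \right]. $$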

Table 3 is the summary of information from the Mastery Validation Exams at Chanute.

Table 3

The Summary of Simple Statistics of Mastery Validation Exams

test       mean      SD      items   α21      subjects
mve103      7.388    1.124     8     0.6321   --
mve104a    11.892    0.442    12     0.4910   83
mve104b    10.120    1.728    11     0.8018   83
mve105      7.706    0.737     8     0.5470   85
mve201a     9.474    0.973    10     0.5254   76
mve201b     8.907    1.325    10     0.4951   86
mve202a    16.186    2.934    20     0.6753   97
mve202b     9.720    0.634    10     0.3573   82
mve204      8.557    1.681    10     0.6253   88
mve205a     6.767    1.558     9     0.3470   90
mve205b     8.110    1.736    10     0.5457   82
mve206a    12.038    1.574    13     0.6942   78
mve206b    15.250    1.619    17     0.4259   80
mve206c    19.257    1.151    20     0.4841   70
mve207      3.761    1.124     5     0.3287   88
mve301      8.727    1.501    10     0.5635   77
mve303     17.380    2.257    20     0.5824   71
mve304      9.209    1.366    10     0.6771   67
mve305      7.458    0.934     8     0.4806   72
mve307     14.683    1.522    16     0.5101   63
mve308      9.037    1.170    10     0.4045   82
mve401      9.254    1.015    10     0.3673   63
mve402     14.138    2.335    17     0.5988   94
mve403      8.095    2.487    10     0.8340   84
mve404      4.254    0.876     5     0.2166   67
mve405a     9.169    1.069    10     0.3701   71
mve405b     8.329    1.991    10     0.7208   70
mve405c     9.087    1.222    10     0.4934   69

In classical test theory, α21 is always smaller than or equal to the other reliability approximations, such as α20 and Cronbach's coefficient α. α20 and α21 become equal only if all items are of equal difficulty (or have equal means if the scores are dichotomous; note that α20 would be used in place of α21 with a compound binomial model). Coefficient α becomes equal to α20 if all items in a test are parallel, that is, all items have the same means and variances in classical test theory. As we previously noted in this chapter, the binomial error model assumes a single common factor and is not concerned with differentiating among item characteristics. The model does not require any information about the item characteristics in a test, such as difficulty and discriminating index, but it does require knowledge of the number of items on a test. It is interesting to note that the mathematically derived ratio of the true to observed score variances in the model becomes equal to the reliability of a test in which all items are of equal difficulty and variance. Therefore the definition of reliability in classical test theory loses an interesting feature in the traditional sense, because in the binomial error model the value of the reliability index is reduced to that of the lowest approximation to the ratio of the true to observed score variances in classical test theory. Since α21 is a special case of the reliability approximations when item differences are ignored, it is exactly what we can expect out of the binomial model.

The conceptualization of reliability is no longer important in this context. Instead, the accuracy of judging the non-mastery and mastery status of examinees becomes the main concern. Millman states this purpose of CRT clearly in his paper (1975), and discusses how many items must be administered from a given item pool so that the proportion of test items in the domain answered correctly can give an accurate estimation of an examinee's true ability.

Setting of Mastery Levels

The mastery level of the Mastery Validation Exams (MVE) of the 37 lessons in the Chanute PLATO AFB CBE Program was set at 80%, although it is impossible to prove that 80% is the most appropriate level for the program. Block (1972) showed in his experimental study that attainment of a 95% mastery level maximized student learning of cognitive tasks in his matrix algebra course, while an 85% level maximized learning as characterized by affective criteria.

Since Chanute's 37 lessons are designed to be "homogeneous" with respect to content and teaching style, all lessons are written under the same principle with the same tutorial logic, although the subject matter in each lesson is different. Therefore Chanute's lessons are not linearly related, and the content difficulty of the lessons is not hierarchically ordered as it would be in teaching mathematics, arithmetic, or foreign languages. If the lessons are linearly related, the mastery levels set for the earlier instructional units should be higher than those of the later instructional units. If the goal of the second unit is the attainment of an 85% mastery level, then the mastery level of the first unit might be 90%, or some other level higher than 85%. Since there is no analytical technique to provide the optimal level of mastery learning, definite statements about the determination of ideal mastery levels cannot be made at this time. Linn (1978) provides an excellent discussion of the topic of "setting standards."

Cutoffs

Mastery levels are usually set by instructors or the author of a lesson, but the decision of mastery and non-mastery is based on examinees' observed test scores. The score that is used to decide mastery and non-mastery is called the "cutoff." Mastery and non-mastery status ought to be defined on the basis of true ability θ, not observed test scores x that are subject to measurement errors. If true ability were known, there would be no incorrect classifications. Unfortunately, true scores are impossible to obtain in practice, so we have to find a way to minimize misclassification.

There are four kinds of classifications: 1. an examinee's true ability θ and observed score x are both at or above a given mastery level θ₀ and cutoff score c, that is, A = {x ≥ c and θ ≥ θ₀}; 2. θ is lower than θ₀ and x is also lower than c, that is, B = {x < c and θ < θ₀}; 3. θ is lower than θ₀ but x is at or above c, F+ = {x ≥ c and θ < θ₀}; 4. θ is at or above θ₀ but x is lower than c, F- = {x < c and θ ≥ θ₀}. The following figure shows these four conditions.

Figure 1. The four classification outcomes A, B, F+, and F- (θ = true ability, x = observed score, θ₀ = true mastery level, c = observed cutoff). The probabilities of these events will be denoted by P(A), P(B), P(F+) and P(F-) respectively.
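The four outcomes can be stated compactly in code. The following is a minimal sketch in which the function name `classify` and the example examinees are hypothetical; it treats x ≥ c as a pass and θ ≥ θ₀ as true mastery, matching the events A, B, F+, and F- defined above.

```python
def classify(theta: float, x: int, theta0: float, c: int) -> str:
    """Return 'A', 'B', 'F+', or 'F-' for true ability theta and observed score x."""
    is_master = theta >= theta0   # true status (unknown in practice)
    passed = x >= c               # decision based on the observed score
    if is_master and passed:
        return "A"                # correctly advanced
    if not is_master and not passed:
        return "B"                # correctly retained
    return "F+" if passed else "F-"  # false positive / false negative

# Hypothetical examinees on a 10-item test with theta0 = 0.8 and cutoff c = 8.
labels = [
    classify(0.90, 9, 0.8, 8),
    classify(0.60, 5, 0.8, 8),
    classify(0.70, 8, 0.8, 8),
    classify(0.85, 7, 0.8, 8),
]
print(labels)
```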



Millman (1975), and then Novick & Lewis (1975), reported the percent of students expected to be misclassified for a given cutoff with various numbers of test items. Millman used the binomial error model, and Novick and Lewis used the Bayesian beta binomial error model. According to Millman's calculations, the percent of students expected to be misclassified at an 80% mastery level using a 10-item test could be as high as 53%.

Emrick (1972) and Huynh (1976) considered the loss ratio of F- to F+ as a means of controlling misclassification, especially false advancement. If later instructional units require the knowledge and skill acquired in earlier units, false advancement will be a problem. Since F- stands for the event in which a student has really mastered the given instructional unit but his/her observed score happens to be lower than the cutoff, retaining such a student in the same unit is not efficient. If the instructional units are fairly independent from one another, as are lessons in the Vehicle Training Program at Chanute Air Force Base, then an appropriate loss ratio would be 1, or at least it is not necessary to set it as high as 10.

Huynh (1976) proposed an evaluation of the cutoff score that minimizes the occurrence of misclassifications for a given loss ratio. With his cutoff score, the loss ratio associated with the probability of a false positive to that of a false negative stays the same, say 10, while the linear combination of the probabilities of both events and the loss ratio (the average loss) is minimized. We will discuss Huynh's method in more detail in conjunction with the 34 Chanute lessons and their MVE test scores.

3.2 Evaluation of the optimal cutoff scores

Huynh derived the optimal cutoff c₀ of a test for a given mastery level θ₀ and loss ratio Q so as to minimize the average loss function R(c) by differentiating it, where R(c) is the linear combination of the probabilities of false positive and false negative and is given by

$$ R(c) = P(F+) + Q\, P(F-). $$

c₀ is the smallest integer such that the incomplete beta function I_θ₀(a+c₀, n+b-c₀) is smaller than or equal to Q/(1+Q), where

$$ p(c_0) = I_{\theta_0}(a+c_0,\; n+b-c_0) = \int_0^{\theta_0} \frac{\theta^{\,a+c_0-1}\,(1-\theta)^{\,n+b-c_0-1}}{B(a+c_0,\; n+b-c_0)}\, d\theta . $$

In order to apply Huynh's result to evaluate c₀, we need the help of a computer to calculate the values of the incomplete beta function for c = 0, 1, 2, ..., n and plot them on paper. The PLATO system eases these steps, and we can obtain the answer through the program "cutoff" written by the present author and T. Weaver. Figure 2 illustrates the procedure to determine the optimal cutoff c₀. The parameters a and b are obtained from the mean and standard deviation of the test and the number of items in the test (denoted by n). Table 4 shows the values for the incomplete beta function at each point i = 1, 2, ..., n, which are calculated from the test scores of MVE201a.
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I * 7 



Figure 2. Determining the optimal cutoff c₀ so as to minimize misclassification (PLATO display for lesson MVE201a: mean = 9.4737, a = 8.5569, subjects = 76, n = 10, α21 = 0.53).
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Table 4 

Ten points in Figure 2

θ₀ = .80, Test = mve201a, a = 8.5560, b = 0.4753

item i    a+i       n+b-i
 1         9.556    9.475
 2        10.556    8.475
 3        11.556    7.475
 4        12.556    6.475
 5        13.556    5.475
 6        14.556    4.475
 7        15.556    3.475
 8        16.556    2.475
 9        17.556    1.475
10        18.556    0.475

The curve in Figure 2 is obtained by plotting the points in Table 4. The horizontal lines, which are marked by losses 0.5, 1, 1.5, 2, ..., 20 in Figure 2, help to evaluate the optimal cutoff, which minimizes the average loss R(c) at c₀ for the partially known loss ratio Q and a given true mastery level θ₀. Since the contents of all lessons discussed in the Chanute PLATO AFB CBE Program deal with independent topics across the lessons, and the lessons are not linearly or hierarchically related, a loss ratio of 1 will be reasonable. Thus, in Figure 2 the smallest integer value of i for which the curve p(i) goes under the line of loss ratio 1 is 7. Therefore c₀ = 7 is the ideal cutoff score of the test MVE201a.
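The graphical procedure can be sketched numerically. The code below is not the PLATO "cutoff" program; it is a minimal stand-in that integrates the beta density with Simpson's rule and searches for the smallest cutoff whose incomplete beta value falls to the threshold (0.5 for a loss ratio of 1). The helper names, the threshold form Q/(1+Q), and the moment formulas in `beta_params` are assumptions; the latter were chosen because they reproduce the a and b printed in Table 4 from the mean and α21 of MVE201a.

```python
from math import exp, lgamma, log
from typing import Optional, Tuple

def log_beta(p: float, q: float) -> float:
    """Natural log of the complete beta function B(p, q)."""
    return lgamma(p) + lgamma(q) - lgamma(p + q)

def incomplete_beta(x: float, p: float, q: float, steps: int = 4000) -> float:
    """Regularized incomplete beta I_x(p, q) by Simpson's rule (needs p > 1, x < 1)."""
    norm = log_beta(p, q)

    def density(t: float) -> float:
        if t <= 0.0:
            return 0.0
        return exp((p - 1.0) * log(t) + (q - 1.0) * log(1.0 - t) - norm)

    h = x / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(x)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * density(k * h)
    return total * h / 3.0

def beta_params(mean: float, alpha21: float, n: int) -> Tuple[float, float]:
    """Assumed moment estimates of (a, b) from the test mean, alpha21, and length n."""
    a = (1.0 / alpha21 - 1.0) * mean
    b = n / alpha21 - n - a
    return a, b

def optimal_cutoff(a: float, b: float, n: int, theta0: float,
                   loss_ratio: float = 1.0) -> Optional[int]:
    """Smallest c with I_theta0(a + c, n + b - c) <= Q / (1 + Q), or None."""
    threshold = loss_ratio / (1.0 + loss_ratio)
    for c in range(n + 1):
        if incomplete_beta(theta0, a + c, n + b - c) <= threshold:
            return c
    return None

# MVE201a: mean and alpha21 as reported for this exam, n = 10, theta0 = 0.80.
a, b = beta_params(mean=9.4737, alpha21=0.5254, n=10)
c0 = optimal_cutoff(a, b, n=10, theta0=0.80, loss_ratio=1.0)
print(round(a, 3), round(b, 3), c0)
```

With these inputs the search stops at the same cutoff the text reads off Figure 2 for a loss ratio of 1.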

It is interesting to note that the cutoff score, c = 8, actually used for MVE201a in the Chanute training program gives a slightly larger value of the probability of misclassification, R(c) = P(F+) + P(F-), than the theoretically derived c₀ does, but not for P(F+), the probability of false positive, or P(F-), the probability of false negative, separately.

$$ P(F+) = I_{\theta_0}(a,b) - \frac{1}{B(a,b)} \sum_{i=0}^{c-1} \binom{n}{i} B(a+i,\; b+n-i)\, I_{\theta_0}(a+i,\; b+n-i) $$

$$ P(F-) = \frac{1}{B(a,b)} \sum_{i=0}^{c-1} \binom{n}{i} B(a+i,\; b+n-i)\, \bigl[\, 1 - I_{\theta_0}(a+i,\; b+n-i) \bigr] $$

The probabilities P(A) = Prob{θ ≥ θ₀, x ≥ c} and P(B) = Prob{θ < θ₀, x < c} are given respectively by the following formulas:

$$ P(A) = 1 - I_{\theta_0}(a,b) + \frac{1}{B(a,b)} \sum_{i=0}^{c-1} \binom{n}{i} B(a+i,\; b+n-i)\, \bigl[\, I_{\theta_0}(a+i,\; b+n-i) - 1 \bigr] $$

$$ P(B) = \frac{1}{B(a,b)} \sum_{i=0}^{c-1} \binom{n}{i} B(a+i,\; b+n-i)\, I_{\theta_0}(a+i,\; b+n-i) $$

The probability of each misclassification for all available MVEs was calculated and summarized in Table 5.
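A numerical sketch of these four probabilities follows. It is an illustration under stated assumptions, not the program used for Table 5: the helper names are hypothetical, the incomplete beta function is approximated with Simpson's rule, and P(A) and P(F+) are accumulated over the passing scores i ≥ c, which is algebraically equivalent to the subtractive forms above. The parameters a and b are those given in Table 4 for MVE201a.

```python
from math import comb, exp, lgamma, log
from typing import Tuple

def log_beta(p: float, q: float) -> float:
    """Natural log of the complete beta function B(p, q)."""
    return lgamma(p) + lgamma(q) - lgamma(p + q)

def incomplete_beta(x: float, p: float, q: float, steps: int = 4000) -> float:
    """Regularized incomplete beta I_x(p, q) by Simpson's rule (needs p > 1, x < 1)."""
    norm = log_beta(p, q)

    def density(t: float) -> float:
        if t <= 0.0:
            return 0.0
        return exp((p - 1.0) * log(t) + (q - 1.0) * log(1.0 - t) - norm)

    h = x / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(x)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * density(k * h)
    return total * h / 3.0

def outcome_probabilities(a: float, b: float, n: int, theta0: float,
                          c: int) -> Tuple[float, float, float, float]:
    """Return (P(A), P(B), P(F+), P(F-)) for cutoff c under the beta-binomial model."""
    p_a = p_b = p_fp = p_fn = 0.0
    for i in range(n + 1):
        # Beta-binomial probability of observing score i.
        weight = comb(n, i) * exp(log_beta(a + i, b + n - i) - log_beta(a, b))
        # Probability that theta < theta0 given the observed score i.
        below = incomplete_beta(theta0, a + i, b + n - i)
        if i >= c:
            p_a += weight * (1.0 - below)
            p_fp += weight * below
        else:
            p_b += weight * below
            p_fn += weight * (1.0 - below)
    return p_a, p_b, p_fp, p_fn

# MVE201a with the a and b of Table 4; compare neighbouring cutoffs at loss ratio 1.
a, b, n, theta0 = 8.5560, 0.4753, 10, 0.80
risk = {}
for c in (6, 7, 8):
    p_a, p_b, p_fp, p_fn = outcome_probabilities(a, b, n, theta0, c)
    risk[c] = p_fp + p_fn
    print(c, round(p_fp, 4), round(p_fn, 4), round(risk[c], 4))
```

The total P(F+) + P(F-) is smallest at c = 7 among the three cutoffs tried, in line with the remark that the cutoff of 8 used at Chanute gives a slightly larger misclassification probability.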

Since the sum of the probabilities of A, B, F+, and F- is 1, the sum of the probabilities of A and B must have its maximum value at c₀, where P(F+) + P(F-) reaches its minimum, as shown in Figure 3.

In Figure 3, the curve of P(F+) + P(F-) (the lower curve, drawn with *'s) decreases slowly until it reaches the bottom at c₀, then increases as the number of items increases, while the curve of P(A) + P(B) (the upper curve, drawn with +'s) reaches its maximum point at c₀.



Table 5

Estimated Probability of Misclassifications
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mve204 
mve205a 



inve285b 
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-Success^ 



a 



0.0621 
0.0314 



Pjf.) P{P^ or F,) P(A or F^) ;rate 



0.0162 0.0783 9^9247 
0. 0639 0.095 3 0 . 84_62. 



0.0026 0.0001 0.0026 0.9997 
0.0011 O.QQg? 0;06 m ^d. Sa2T 



C 9 
0 



0.0348 0.0259 .0.0606 
n nm_ ^,^0259 0.0606 



0.8705 
0.8m 



Cq 6 



0.0235 ^ :O;0094 0.0329 
0 . 0123 0.0399 0 . 0 522 



0.9739 

0i9323 



^8 



0.0357 
0.0238 



0.0064 
0.0262 



9.0421 
0.0499 



'0.9788 
0^9472 



c 



So 



^0 



0.1078 

)718 
0.1163 

0^1 

0.0055 
0.0031 
0.,0996 
- 0.6996 
0.1428 
0.1428 



0.0146 0.1223 
Q.0g56 6.1266 
0.0624 0.1788 
1624 . 0.1788 
0,0001 

0.0122 o . o m 



0.9375 
0.8598 



a 

10 

JL 
12 
44 



0.1607 

0.0478 
0.0266 

9 i 0606 
4.0305 



0.0503 
O.Cfe03 
0.1341 

9^1341: 



0.1499 
0.U99- 



0.6495 
0.6495 
0.9998 
0.9853 
0.7803 
b.78QS 



0.2769 
0.2769 



0.3612 



0.0634 
0.0634 



0.0184 

0.0535^ 
0.0113 
0.0911 



0.2141 
0.2141 
0.0662 



O.d&Ql 



0.0719 
0.1216 



0.6913 
6.6913 
0.9207 
J? . 8644 
0.9708 
0.8608 



13 
c"" 16 



0.0057 
0.0030 



0.9003 
4.0116 



0.0061 
0.0146 



0.9991 
0.985^2 



1p I 



0.0965 
0.287§ 



0.1957 0.2922 



0.: 

Oil 



30 



.94 



.86 



.88 



.90 



.72 



.82 



.82 



.95 



(91 



Table 5 (cont.)

Test      Cutoff     P(F+)    P(F-)    P(F+ or F-)   P(A or F+)   Success rate
mve301    c₀ = 8     0.0894   0.0540   0.1434        0.8184       .79
          c  = 8     0.0894   0.0540   0.1434        0.8184
mve303    c₀ = 15    0.1070   0.0266   0.1335        0.8867       .90
          c  = 16    0.0730   0.0653   0.1383        0.8140
mve304    c₀ = 8     0.0471   0.0292   0.0763        0.8922       .82
          c  = 8     0.0471   0.0292   0.0763        0.8922
mve305    c₀ = 5     0.0632   0.0036   0.0668        0.9827       .96
          c  = 7     0.0247   0.0787   --            0.8691
mve307    c₀ = 11    0.0526   0.0056   --            0.9797       .81
          c  = 12    0.0413   0.0187   0.0600        0.9553
mve308    c₀ = 7     0.0732   0.0147   0.0880        0.9601       --
          c  = 8     0.0498   0.0578   --            0.8936
mve401    c₀ = 7     0.0364   0.0109   0.0473        0.9872       .83
          c  = 8     0.0252   --       --            0.9328
mve402    c₀ = 13    0.1494   0.0395   0.1890        0.7809       .79
          c  = 14    0.0910   0.0961   0.1871        0.6660
mve403    c₀ = 8     0.0771   0.0294   0.1065        --           .79
          c  = 8     0.0771   0.0294   0.1065        --
mve404    c₀ = 3     0.2100   0.0130   0.2230        0.9564       1.00
          c  = 4     0.1455   0.0840   0.2296        0.8208
mve405a   c₀ = 6     0.0560   0.0025   0.0585        0.9919       1.00
          c  = 8     0.0326   0.0513   --            --
mve405b   c₀ = 8     0.0987   0.0419   0.1405        0.7344       .91
          c  = 8     0.0987   0.0419   0.1405        0.7344
mve405c   c₀ = 7     0.0794   0.0123   0.0917        0.9543       --
          c  = --    0.0527   --       0.1005        0.8921

c₀ is the theoretically derived cutoff to minimize P(F+) + P(F-).
c is the cutoff actually used in the PLATO Service Program at Chanute.
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' : ' Figure 3 ' ' ' ' ' . | ' 
Graph of P(F+) + P(F-) over cutoff scores (lesson = MVE201a, n = 10)

Table 5 indicates that the actually used cutoff scores c produce higher probabilities of P(F+ or F-) than the theoretically determined cutoffs c₀, except in a few cases. Since the theoretical cutoffs are chosen to minimize the sum of the probabilities of false negative F- and false positive F+, all values in column 5 of Table 5, P(F+) + P(F-), are no larger for c₀ than for c. The sum of the probabilities of A and F+ is the expected success rate, and this sum matches the observed success rate given in the last column fairly well.

If the c₀'s were used as cutoffs for MVE test scores, only 12 lessons would fail to reach a probability of observed success of .90, which was used as the lesson validation criterion in the PLATO AFB CBE program, while 18 lessons fall short in P(A) + P(F+) (i.e. p(x ≥ c)) when the c's are used.

The probability of false negative, P(F-), stands for the case in which an examinee has really mastered the goal of the instructional unit but his/her observed score happened to be lower than the cutoff c that was used; such an examinee does not have to repeat the instruction. If efficiency of training in terms of shortening the training time is important, P(F-) should not be so large. For example, MVE207 has P(F-) = .1957, which means that 88 x 0.1957 = 17 students repeated the same instruction unnecessarily. The other values are less than 10%, which means that five to eight students repeated a lesson mistakenly. Table 6 shows the number of students misclassified in the Mastery Validation Exams. Since the observed cutoffs c for all MVEs but MVE207 are larger than or equal to the optimum cutoff c₀, the number of misclassified students of the type F+ becomes larger when using c₀ than when using c, and errors of the type F- turn out to be smaller for c₀. The total misclassifications are minimized by using c₀. How the cutoff should be selected is a problem of tradeoff.

Since the loss ratio of 1 was selected in our study, we conclude that most cutoffs at Chanute were not the best choice. By adopting the theoretically derived cutoffs c₀, the probability of misclassifications could have been minimized.
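The student counts quoted here are simply the number of examinees multiplied by the corresponding probability. A minimal sketch, with a hypothetical helper name, using the MVE207 figures from the text and the MVE303 values (71 subjects, P(F+) = 0.1070 at c₀ = 15) from Tables 3 and 5:

```python
def expected_count(n_students: int, probability: float) -> float:
    """Expected number of misclassified students, rounded to one decimal."""
    return round(n_students * probability, 1)

# MVE207: 88 examinees, P(F-) = 0.1957 (the example worked in the text).
mve207_false_negatives = expected_count(88, 0.1957)

# MVE303: 71 examinees, P(F+) = 0.1070 at the theoretical cutoff.
mve303_false_positives = expected_count(71, 0.1070)

print(mve207_false_negatives, mve303_false_positives)
```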

The probabilities of success by observation, prob(x ≥ c), or prob(A or F+), suggest that the validation criterion of lessons in the Chanute program must be changed. Twelve lessons have a passing probability of less than .90, even if the theoretical cutoff c₀ had been adopted instead of the actually used cutoff score c. Those lessons which have failed apparently need more attention from the instructional designers, but at the same time their tests need to be reviewed too, because we don't know the cause of misclassifications in a test. The investigation along this line will be taken up in the next section.

It should be noted that the adopted cu[...]
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Table 6 



Estimated Number of Misclassified Students

           Cutoff        F+              F-
Test       c₀    c       c₀     c        c₀     c
mve103      6    --       5.3   --        1.4   --
mve104a     7    10       0.2    0.1      0.0    0.5
mve104b     9     9       2.9    2.9      2.1    2.1
mve105      6    --       2.0   --        0.8    3.4
mve201a     7     8       2.7   --       --     --
mve201b     7    --       9.3    6.1      1.3    4.8
mve202a    16    16      11.3   11.3      6.1    6.1
mve202b     5    --       0.5   --        0.0   --
mve204      8     8       8.8    8.8      4.4    4.4
mve205a    --    --      12.9   --       12.1   12.1
mve205b     8     8      12.4   12.4      5.2    5.2
mve206a    10    --       3.7   --        1.4    4.2
mve206b    12    --       4.8    2.4      0.9    7.3
mve206c    13    16       0.4    0.2      0.0    0.8
mve207      5     4       8.5   25.3     17.2    4.8
mve301      8     8       6.9    6.9      4.2    4.2
mve303     15    16       7.6    5.2      1.9   --
mve304      8     8       3.2    3.2      2.0    2.0
mve305      5     7       4.5    1.8      0.3    5.7
mve307     11    12      --     --        0.4    1.2
mve308      7     8      --     --        1.2    4.7
mve401      7     8       2.3   --        0.7    2.8
mve402     13    14      14.0    8.5      3.7   --
mve403      8     8       6.5    6.5      2.5    2.5
mve404      3     4      14.1   --        0.9    5.6
mve405a     6     8       4.0    2.3      0.2    3.6
mve405b     8     8       6.9   --        2.9    2.9
mve405c     7    --       5.5    3.6      0.8    3.3

c₀ is the theoretically derived cutoff to minimize P(F+) + P(F-).
c is the cutoff actually used in the PLATO Service Program at Chanute.
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^bwly for the smaller K/ values '^No . of items in a test) but starts . 
dropping rapidly until K reaches K = 9, and again slows down. The shape of the curves varies quite a bit among MVEs, and some start dropping rapidly at around K = 7 or 8 for an 80% true mastery level. Thus, the loss ratios of 8 and 20 can have the same optimal cutoff for the same true mastery level. This is because the beta binomial model deals with continuous scores while the real data are discrete.
•• ■ . '■- - ■ ' ' ' ■ . ' _ d • . 
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VALIDATION OF LESSONS AND CRITERION REFERENCED TESTS 



4.1 Predicting the Percentage of Success Rate for the Lesson

Table 7 shows the estimated probability of success in terms of the proportion of true score to the number of test items, or true ability level θ. These calculations are based on the error-free true ability level θ, so they are more reliable than the values obtained in Table 2, where values were calculated from the observed scores.

Since P(π > .9), the probability of 90% of the examinees achieving mastery, was based on the observed success rate and sample size, its values don't reflect the information from tests, such as test length, α21, mean and standard deviation of a test. However, the probability P(F+ or A) is derived from unique information obtained from each test; hence we can consider it more accurate than P(π > .9). The lessons which have values larger than [value illegible] for P(A or F-) and P(A or F+) might not require any further revision, but others might need it. Lessons 105 and 308 probably don't require any further revision, but 204, 207, 303, 304, 402, and 405b might need revision of the lesson or tests in spite of not being recommended according to the validation criterion that has been used in the Chanute program. The probability of PASS based on the observed scores tends to provide larger values, so that the validation criterion based on the probability of true ability level, P(A or F-) (i.e. p(θ ≥ θ₀)), will be a more plausible standard.

It is important to note that these lessons may not really need revision; instead, the result may be due to poor test construction. So
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Vaiirfntion Crltiaria 
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.P7 










•.Of. 








."7 












>n 


.A3 








.07 




revision '^ccpnnori'^O'^^ (^^^•) 


2n7n 






.10 






02h 




. 9 " 






2n/, 


73** 


.7?^ ** 








OSh 


. 3S** 


.36 ** 


.Oh 








.60 ** 


.10 










.02 ' 


•^3 




2 






. ."7 










no 


ho 

• 


.Uv 






Al** 


.31 ** 


.A7. 


— . ^'S^ 


. 301 ; . , 


7^^"c* 


4^? ** 


h 


P.M.' ■ ' 


' 30T ' . 




''. ro ^* 
















305 ^ . 




. ' ■ 


.O^i 




107 - 




.0)1 


0 




^ 30r , 


on 


.oi^. : 


a 


j?_,_rL_ 'J__.^^:^ _p__;. 




, 0«s 


.OP' 


.0-' 


' ■ . ' * . . • ■ 




^7** 


. 7« ** 


.0.1 










- .01 




40/» ' ; 


7f»** 










0/, 








VOSK 






\ . ./.I ^ 




i05c - 
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far, the only available technique to measure the qtialit^ of less-ons^ is 

to examine the result of a CRT given at the end of the lesson. If the test is constructed with P(F+ or F-) = .2922 (α21 = .3287), then it will be unfair to use the measure to question the quality of the lesson. The measure does not distinguish between the test and the lesson. Thus, the faulty part may be the test and/or any other part or parts of the lesson. This argument can also be applied to the reverse situation. Therefore, construction of a good test will be a key point in judging the quality of a lesson that will be indirectly measured by this test.

' V *- ■ " '■- ■ * 

A^-Validation of Mastery Validation Exams 

« ■ 

In the previous chapter, we disc.ussed the- optimal cutoff Cq of H CRT \^ 
with respect to Mastery Validation Exams in the ItATG AFB fflE Program at 
Chahute Air Force Base:. . - ^ ^ 

The evaluation study of the program, suppdrted by Advanced Research 
Program Agincy, measured some criterion variables which ^wbtild be 
helpful l;n conduct ing a yaiidat ioh study of MVEs'. The ;evaluatidri study 
revealed that a substantial, ntiber of examinees were miclassified(Table^ 
6)i Since detailed information on the deiign used in the evaluation 
study can be found in Dallman et al. (1977), jus^t a brief description 

wiii .be given herev - • ' ■ 

' ■' ' • ■ . ' 

• A Sb-item mT was given at the beginning and end df ithe eight-week 

. ' , ■ ■ • ■ ' * _ __ _ _ * _ . 

PLATO AFB Program, which included 37 dh-lihe leisdris. 'The 37 lessdtis 

were divided intp four subsets called Blockl^ Blbck2i .BlockS^ and iBldck4: 



\^trer a studettt stadied aAd mastered aii. lessons iit a ^ioqk, he took the 

■ _ ■■--"u :_;:• ■->■.: . 

' biocic test; the Block test score was corfited' ia "his. final'^rade for the 
course ."^^te had to take all-^fbur block te^ts, and thett a pdsttest was giv-eyi 
ifi brdgr to neasuire the effectiveness of the program. Each block^test had/ 

" twenty iteins' which were either multiple choTce or* matching. The ' 

*• . , ' * ' 

ct)eeficient alpha reliabilities were not calculated because the te^ts - . . 
were i^rittLen on the PLATO system ^and the item information was npt 
collected^ *ButC3t21 was available in the: fbUdwihg chart • . Figur-e 4 , 
gives a flow chart of the testing program/ . ' ^ ^ * 

in order to' vMidate the effectiveness of lessons^ four kinds of 

, correlations wete carculated.. These corfelatidtis ar^^'de^ycribed in>. the 

f.-^ • * * : / • • _ 

fbllbwirig paragraphs. ... * 

Each Block's test scores were matched with the corresponding Mastery
Validation Exam scores and the time needed to master the lesson
(mastery time), and their correlations were calculated over the
subjects. These two correlation values of 27 lessons were denoted by
r(B, MVEs) and r(B, time) respectively. Their values are shown in
Table 8.

The true gain scores of posttest from pretest were estimated by a
multiple regression procedure; the true score difference t2 - t1 of
the observed score difference x2 - x1 was regressed on the post- and
pretest scores. It is known that the regression of t2 - t1 onto the
two variables x1 and x2 is the same as regressing t2 - t1 on the score
x2 - x1 and the residual score, c2, of x2 on x2 - x1 (Tatsuoka, 19__),
because the covariance of x2 - x1 and c2 equals zero and both x2 - x1
and c2 are linear combinations of x1 and x2.

Therefore, the multiple regression R(t2 - t1 | x2, x1) will be given
as the sum of the regressions R(t2 - t1 | x2 - x1) and R(t2 - t1 | c2):

    R(t2 - t1 | x2, x1) = R(t2 - t1 | x2 - x1) + R(t2 - t1 | c2).

[Figure 4. Block diagram of student flow through PLATO-based portion of
Automotive Course. The flow runs in sequence: Pretest (50-item
norm-referenced test, coefficient α = 0.40); lessons and MVEs in
Block 1 (103, 104a, 104b, 105); Block Test 1 (20-item test, α21 = 0.?6);
lessons and MVEs in Block 2 (201a, 201b, 202b, 204, 205a, 205b, 206a,
206b, 206c, 207); Block Test 2 (20-item test, α21 = 0.3?); lessons and
MVEs in Block 3 (301, 303, 304, 305, 307, 308); Block Test 3 (20-item
test, α21 = 0.47); lessons and MVEs in Block 4 (401, 402, 403, 404,
405a, 405b, 405c); Block Test 4 (20-item test, α21 = 0.42); Posttest
(the same test as the pretest, coefficient α = 0.63). A "?" marks a
digit that is illegible in the source.]

Note that the regression coefficient of the first term is the
reliability of gain scores and that of the second term is the increment
of multiple R². The multiple R is .861, hence the reliability of the
multiple regression gain score is R² = .7405. The first term, the
simple difference score, has the reliability of .1047; the second term
is .6358.
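The additivity used above can be checked numerically. The following sketch uses simulated data and NumPy least squares to show that regressing a gain on x1 and x2 gives the same R² as the sum of the R² values obtained from x2 - x1 and from the residual c2 separately. The sample size, variances, and random seed are illustrative assumptions and are not values from the Chanute study; in practice the true scores are not observable, so this only illustrates the algebra.

```python
import math
import numpy as np


def r_squared(y, X):
    """R-squared of an ordinary least-squares fit of y on X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return 1.0 - resid.var() / y.var()


rng = np.random.default_rng(0)
n = 5000
t1 = rng.normal(0.0, 1.0, n)             # simulated true pretest scores
t2 = t1 + rng.normal(0.5, 0.6, n)        # simulated true posttest scores
x1 = t1 + rng.normal(0.0, 1.0, n)        # observed pretest
x2 = t2 + rng.normal(0.0, 1.0, n)        # observed posttest

gain = t2 - t1
d = x2 - x1                              # simple difference score
slope = np.cov(x2, d)[0, 1] / d.var(ddof=1)
c2 = (x2 - x2.mean()) - slope * (d - d.mean())   # residual of x2 on x2 - x1

r2_full = r_squared(gain, np.column_stack([x1, x2]))
r2_diff = r_squared(gain, d)
r2_resid = r_squared(gain, c2)
print(r2_full, r2_diff + r2_resid)       # the two numbers agree

# Arithmetic of the reported values: .1047 + .6358 = .7405, and R = sqrt(.7405).
reported_total = 0.1047 + 0.6358
reported_r = math.sqrt(reported_total)
```

The two printed values coincide because x2 - x1 and c2 are uncorrelated in the sample and span the same space as x1 and x2.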

This estimated gain score has a higher reliability than those of
pretest and posttest separately. This score was correlated with MVE
scores and mastery time. Table 8 shows the result.

The optimal cutoffs that were evaluated in the previous chapter were
divided by the number of items in the corresponding Mastery Validation
Exam. The same operation was used for the difference of the mean from
the observed cutoff c_o in each MVE. This value expresses the distance
of c_o from the mean in each test. The summary description of these
variables and the correlation matrix are given in Table 9.

The probability of false positive (or advancement), P(F+), has
correlation values of -.562, -.659, and .638 with nafter,
(mean - c_o)/n, and P(F-) (false negative, or retainment),
respectively. This means that the misclassification of false
advancement tends to occur more often when the cutoff c_o is closer to
the mean, and that a test which advances the students to the next
lesson more frequently by mistake also tends to retain the students
whose true scores are really above the
Table 8

The correlations of Block tests to MVE scores and mastery time

Lessons: 103, 104a, 104b, 105; 201a, 201b, 202a, 202b, 204, 205a, 205b,
206a, 206b, 206c, 207; 301, 303, 304, 305, 307, 308; 401, 402, 403,
404, 405a, 405b, 405c

Columns: r(B, MVEs), r(B, time), r(G, MVEs), r(G, time)



il5 
.36^ 



-.22 
-^.33 



* V 



.23 

.19 
.44^ 

.20 



-.38' 
-.43^ 



-.34* 



.34* 

.iSr 

.17 
26 
.21 
.28* 
.25 
.40* 



,12 




.12 
.25 
.04 
.03 
.21 

,24 

ids 

.21 
.04 
.04 
.17 



.44* 

.07 

.28* 

.11 



.18 
.15 

.13 
-.02 

;33^ 
. 25 



-.05 



-.40* 
-.43* 

-.07 
-. 13 

-'.32* 
-.26 



-.22. 
-.18 

-.08 
- . 27 



'I 



.04 ■ 
.34 
.38 , 
.07 
.30" 
.01 



.08- 

.21 

.27 

.19 

.23 



^.08 
.42* 
.31* 
.41* 
' . 00 



-.96 

-.05 

-.37 

-.26 

-.30^ 

^.07 



..50" 
.25 
.'40* 
J. 02 
' -,07 
.25 
.37* 



.15 
..14 
.23 
,00 
.01 
,06 
,11 



.32* 

,46*. 

.02 ' 
•12 
.17 
.19 



-.21 : 

-•34*., 

-.02 

-r.33* 

-.11 

-.07 



^significant at p < .05. 
Table 9

A Correlation Matrix with Summary Description of Variables

Variable              Description
 1  P(F+)             false positive
 2  c_o/n             theoretical cutoff divided by number of items
 3  α21               the ratio of true variance to observed variance
 4  P(F+) + P(F-)     probability of misclassification
 5  nafter            number of subjects using a lesson after it was
                      declared to be validated
 6  %fail             observed percentage of failure in MVE
 7  P(π > .9)         Bayesian estimate of success rate in the population
 8  range             maximum mastery time minus minimum mastery time
 9  r(G, MVEs)        correlation of gain to MVE scores
10  r(G, time)        correlation of mastery time to gain
11  r(B, MVEs)        correlation of block test to MVE scores
12  r(B, time)        correlation of block test to mastery time
13  items             number of items in a test
14  (mean - c_o)/n    relative distance of c_o from the mean (c_o observed)
15  P(F-)             false negative

[Columns 1 to 3 of the matrix do not survive in the source. A "?" marks
a digit or sign that is illegible.]

         4       5       6       7       8       9      10      11
 4   1.000
 5   -.617   1.000
 6    .165    .335   1.000
 7   -.226   -.265   -.903   1.000
 8    .345   -.304    .206   -.113   1.000
 9   -.264?   .271    .032    .053   -.074   1.000
10    .054   -.09?   -.460    .386   -.414   -.377   1.000
11   -.102    .125    .286   -.368   -.192    .403   -.275   1.000
12   -.056   -.133   -.320    .355   -.120   -.235    .520   -.468
13   -.211    .426    .385   -.339    .070    .231   -.190   -.034
14    .855   -.489    .408   -.356    .415    .119?  -.193    .141
15    .869   -.544    .293   -.281    .417   -.196   -.171    .099

Note. All correlation values were transformed by Fisher's Z
transformation. Probabilities were transformed by sin^-1(√p).

(Table 9 cont.)

        12      13      14      15
12   1.000
13   -.159   1.000
14   -.228?  -.119   1.000
15   -.17?   -.264    .956   1.000
mastery level. The correlation of P(F+) with the variable nafter, the
number of students who studied a lesson after the validation date was
set (if over 90% of students pass the mastery level of an MVE, then the
lesson was said to be validated), indicates that the probability P(F+)
will be small for the lessons whose validation dates were established
at an earlier date during the period of the evaluation study of the
PLATO program. This relation is also true for the variables
P(F+ or F-) and P(F-), because the correlations of variable nafter
with them are -.617 and -.544 respectively. Moreover, P(F+), P(F-),
and P(F+ or F-) correlate highly with variable (mean - c_o)/n, with
the values of -.659, -.855, and -.956 respectively. But the
correlation between nafter and (mean - c_o)/n is significant, at -.489.
Hence, we cannot state that lessons which were quickly validated will
produce less chance of misclassification. Since the correlation of
(mean - c_o)/n and nafter is -.489, which is significantly high, the
cutoff c_o associated with some of these Mastery Validation Exams
might have happened to be chosen closer to the means of the
corresponding MVEs. This fact raises a question about the properness
of the validation criterion that has been used in the PLATO Service
Program at Chanute.
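The note to Table 9 states that correlations were transformed by Fisher's Z and probabilities by an inverse-sine square-root transformation before the analysis. A minimal sketch of the two transformations follows; the function names are illustrative, and the exact form of the probability transform (for example, whether a factor of 2 was applied) is an assumption, since the note is only partly legible.

```python
import math


def fisher_z(r: float) -> float:
    """Fisher's Z transformation of a correlation, z = atanh(r)."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return math.atanh(r)


def arcsine_sqrt(p: float) -> float:
    """Variance-stabilizing transform of a probability, asin(sqrt(p))."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return math.asin(math.sqrt(p))


# One correlation quoted in the text, and a hypothetical probability.
z_nafter = fisher_z(-0.617)
back = math.tanh(z_nafter)          # inverse transform recovers -.617
t_prob = arcsine_sqrt(0.25)         # asin(0.5) = pi / 6
print(round(z_nafter, 4), round(back, 3), round(t_prob, 4))
```

Both transforms make the sampling variability of the quantity roughly constant across its range, which is the usual reason for applying them before correlational or regression analysis.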

A stepwise multiple regression procedure was performed on the fifteen
variables, and three predictors were selected to predict the variable
P(F+ or F-). Table 10 gives a summary of the analysis.



Table 10

Estimation of P(F+) + P(F-) by Stepwise Multiple Regression

variable          beta coefficient   S.D. error     t
α21                    -.193         [illegible]    2.193 *
nafter                 -.205         [illegible]    2.092 *
r(G, time)              .144            .089        1.618
(mean - c_o)/n          .829            .102        8.127 **

Multiple R = .9101, constant = .60, F(3,23) = 30.305 **
*significant at p<.05   **at p<.01

The first predictor, (mean - c_o)/n, for the criterion P(F+ or F-),
variable 4, has a beta coefficient of 0.792 and a significance test
t-value of 7.9. This result is expected, but α21 entering as the
second predictor in the analysis is surprising. If α21 is high enough,
then the probability P(F+ or F-), the occurrence of misclassification,
will be minimized. Most Mastery Validation Exams have reliabilities of
around .4 to .5, which is quite low, so it is natural to expect that
misclassifications will occur quite frequently in the program.

The variable α21 does correlate significantly with variable 13, number
of items in the tests; it correlates with variable 6, percentage of
failure, at the 5% significance level. This relationship would be
interesting to investigate further, especially when the test lengths
are short and about the same, containing 15 items, as is customary in
criterion-referenced tests.



The following picture might help for a quick, intuitive grasp of the
relationship between F+, F-, and the variables c_o, n, and μx. The
areas marked F+ and F- depend on μx - c_o and n - μx.

[Figure 5. Graphic not recoverable from the source.]
Table 11

Relationship between the optimal cutoff c_o/n and other variables

variable          beta coefficient   S.D. error     t
α21                     .256            .142        2.085 *
range                   .583            .141        4.135
no. of items           -.362            .139        2.604 *

Multiple R = .7528, constant = .55, F(3,23) = 10.027
*significant at p<.05   **at p<.01

Table 11 gives the results of a stepwise multiple regression analysis
where the criterion is the optimal cutoff c_o divided by the number of
items. Entered predictors are variables 8, 13, and 3. t-tests of
significance for the beta coefficients indicate that all three
variables are significant at p<.05. Since variable 8 is the range of
time (the difference between the maximum time needed and the minimum
time), the longer the time span needed by students to master a lesson,
the larger the ratio of the optimum cutoff to the number of items will
be. It should be noted that the procedure of evaluating the optimal
cutoff c_o does not depend on the time needed to complete or master a
lesson. If c_o/n is relatively higher, then there is more failure
[illegible] correct failure, B in Figure 5, resulting in a larger
range in the mastery time of a lesson. α21 is again among the
predictors, and if α21 is larger, then c_o/n becomes more affected by
it. This analysis needs to be more refined, since a better way to
interpret the results should be found.



Table 12

Relationship between r(G, MVEs) and other variables

variable          beta coefficient   S.D. error     t
c_o/n                  -.336            .181        1.856 x
P(π > .9)               .207            .190        1.089
r(G, time)             -.535            .193        2.772 *

Multiple R = .5430, constant = 0.27, F(3,23) = 3.206 *
*significant at p<.05   x significant at p<.10



Table 12 shows the results of a similar analysis, using the
correlation of gain scores and Mastery Validation Exam scores as the
criterion. A larger value of this variable means that the gain score
was non-negligibly affected by the Mastery Validation Exams which have
a large correlation value r(G, MVEs). We know from Table 8 that MVE
scores of lessons 104b, 201a, 201b, 206c, 304, 305, 307, 401, and 402
have significant values of correlation. This analysis revealed that
the correlation of mastery time to gain scores contributes the most
significantly in predicting variable 9. Since mastery time of a lesson
correlates highly with aptitude scores, as shown in Table A of the
Appendix, this result is expected.

The students affected most by the decision of cutoff scores are
mediocre students whose scores are near the cutoffs, and therefore
they tend to be more often misclassified in either the positive or
negative way. The fact that the beta coefficient of variable 2 is
-.336 means that the smaller the values of c_o/n, the larger the
contribution to the gain will be; thus mediocre students have a
greater chance of repeating the lessons, since the observed cutoff c
was set to 80% across all MVEs, which is the true mastery level that
was aimed for.
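The t and F statistics reported with the regression tables can be cross-checked from the other printed quantities. The sketch below assumes the usual relations t = |beta| / (standard error) and F = (R²/k) / ((1 - R²)/(N - k - 1)), with k = 3 predictors and N = 27 lessons, which matches the reported degrees of freedom (3, 23). The function names are illustrative, and the check is only a consistency check on the printed values, not a reanalysis of the data.

```python
def t_ratio(beta: float, std_error: float) -> float:
    """Absolute t statistic of a regression coefficient."""
    return abs(beta) / std_error


def f_from_multiple_r(multiple_r: float, k: int, n_cases: int) -> float:
    """Overall F statistic of a regression with k predictors and n_cases cases."""
    r2 = multiple_r ** 2
    return (r2 / k) / ((1.0 - r2) / (n_cases - k - 1))


# Rows of Table 12: (beta, S.D. error, reported t).
table12_rows = [(-0.336, 0.181, 1.856), (0.207, 0.190, 1.089), (-0.535, 0.193, 2.772)]
for beta, se, reported_t in table12_rows:
    print(round(t_ratio(beta, se), 3), reported_t)

f_table11 = f_from_multiple_r(0.7528, 3, 27)   # reported F(3,23) = 10.027
f_table12 = f_from_multiple_r(0.5430, 3, 27)   # reported F(3,23) = 3.206
print(round(f_table11, 3), round(f_table12, 3))
```

Agreement between the recomputed and printed values supports the reading of the partly illegible table entries.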




Table 13

variable          beta coefficient   S.D. error     t
α21                    -.152            .178
r(G, MVEs)              .224            .185
r(G, time)              .305            .190
no. of items           -.344            .195        1.966 x
(mean - c_o)/n          .314         [illegible]    [illegible]

Multiple R = .6503, constant = 1.09, F(5,21) = 3.077 *
*significant at p<.05   x significant at p<.10

Table 13 shows the results of the analysis when the criterion is
variable 7, the probability P(π > .9) that 90% or more of the students
in the population from which our sample was drawn will achieve the 80%
mastery level on the end-of-lesson test. Five predictors among
variables 1, 2, 3, 8, 9, 10, 11, 12, 13, 14, and 15 were selected. The
variables nafter and %fail were omitted because P(π > .9) was derived
from these two values in the sample. None of the beta coefficients was
significant, but we might be able to say that P(π > .9) depends to
some extent on the test length (beta = -.344, t = 1.97). Also, the
distance of the mean from the observed cutoff c_o affects the value of
P(π > .9), such that if the observed cutoff c_o is considerably
smaller than the mean, then the success rate of the lesson becomes
larger. This means that the test was probably too easy in comparison
with other tests. This result confirms that the validation criterion
used at the PLATO AFB CBE program at Chanute Air Force Base depended
exclusively on the test, that is, on the characteristics of the MVE;
hence the method that was used to assess the quality of lessons was
inadequate. There is a great need for the development of a method to
validate lessons directly, without depending entirely on the lesson
tests.
SUMMARY AND DISCUSSION

The problem of setting a validation criterion for a given lesson is
important in practice, but it has never become a focus for educational
researchers, although the closely related topic of criterion-referenced
testing has been one of the most popular research targets in the past
few years. Both the sample binomial model and the Bayesian binomial
model (beta binomial model) were adopted to set a better validation
criterion for a given lesson, and the result from the latter model
matched our data better than did the former. Therefore, the prediction
of the future success rate of the lesson using the Bayesian binomial
model is recommended for setting a validation criterion when (a) the
information is limited to the percentage of failure (or success) on
the end-of-lesson test and (b) an author (or instructor) of the lesson
has a certain level of prior belief as to what extent his/her lesson
will be successful. If the scores of a test given at the end of a
lesson are available, then it is recommended to use the information
that one can get from the test performance as much as possible in
setting a validation criterion of the lesson. Applying the beta
binomial model of criterion-referenced testing, the estimated
probability of the observed score being larger than the observed
cutoff c will be a better validation criterion than the success rate.
In other words, the probability of mastery, that is, of passing the
criterion score, will serve as a validation criterion of the lesson.
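The Bayesian prediction of a lesson's future success rate can be sketched as follows, assuming a Beta prior on the population success rate π that is updated by the observed numbers of passes and failures, and evaluating P(π > .9) by numerical integration. The uniform prior, the counts of 27 passes and 1 failure, and the function names are hypothetical choices for illustration; the sketch is not the authors' actual computation.

```python
import math


def beta_pdf(p: float, a: float, b: float) -> float:
    """Density of a Beta(a, b) distribution at p, for 0 < p < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))


def prob_success_rate_exceeds(threshold, passes, fails,
                              prior_a=1.0, prior_b=1.0, steps=20000):
    """Posterior P(pi > threshold) under a Beta prior, by the midpoint rule.

    Assumes the posterior parameters are at least 1 so the density is bounded.
    """
    a = prior_a + passes
    b = prior_b + fails
    h = (1.0 - threshold) / steps
    total = sum(beta_pdf(threshold + (i + 0.5) * h, a, b) for i in range(steps))
    return total * h


# Hypothetical lesson: 27 of 28 students passed the MVE, uniform prior.
p_valid = prob_success_rate_exceeds(0.9, passes=27, fails=1)
print(round(p_valid, 3))
```

An author with a stronger prior belief in the lesson would supply larger `prior_a` relative to `prior_b`, which raises the posterior probability for the same observed data.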

Of course, the decision of mastery or non-mastery must theoretically
be based on a student's true performance level and not on the observed
scores, but the true score will never be available in practice. It is
possible, however, to estimate the probability of the true score being
greater than or equal to a given true mastery level, such as 80%.
Unfortunately, we don't have any analytical method to determine the
best, most suitable true mastery level for a program.

The four kinds of probabilities, correct pass (A), correct fail (B),
false positive (F+), and false negative (F-), were calculated over 27
Mastery Validation Examinations (a) when the observed cutoff c (80%
correct) was used and (b) when the optimum cutoff c_o, which minimizes
the misclassification of students, was used. The results indicate that
even if c_o were used in the process, some tests will show
substantially large numbers of misclassifications of both the false
positive and false negative types. Since it is interesting to
investigate why some tests showed as much as about [illegible]% of
misclassifications while other tests showed very little, three
stepwise multiple regression analyses were used to select the
predictors of P(F+), P(F-), and P(F+ or F-) separately. The common
strongest predictor was the distance of c_o from the mean of a test,
which was what we expected. The second common predictor was α21, the
internal consistency of a criterion-referenced test. As α21 increases
to 1, all three criterion variables get smaller; hence fewer
misclassifications occur. That means the internal consistency of the
items in a given test is important in controlling false positive and
false negative errors.
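The four classification probabilities and a loss-minimizing cutoff can be illustrated with a small beta-binomial sketch. It assumes that true scores follow a Beta(a, b) distribution, that the observed score on an n-item test is binomial given the true score, and that a student is a true master when the true score is at least π0. The parameter values (Beta(8, 2), 10 items, π0 = 0.8), the function names, and the definition of the loss ratio as the cost of a false positive relative to a false negative are assumptions for illustration; the brute-force search is a stand-in and is not Huynh's procedure.

```python
import math


def beta_pdf(p, a, b):
    """Density of a Beta(a, b) distribution at p, for 0 < p < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))


def pass_prob(p, n, c):
    """P(observed score >= c) on an n-item test for true score p."""
    return sum(math.comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(c, n + 1))


def classification_probs(a, b, n, c, pi0, steps=2000):
    """Probabilities of correct pass (A), correct fail (B), F+ and F-."""
    out = {"A": 0.0, "B": 0.0, "F+": 0.0, "F-": 0.0}
    h = 1.0 / steps
    for i in range(steps):
        p = (i + 0.5) * h                      # midpoint rule over true scores
        weight = beta_pdf(p, a, b) * h
        q = pass_prob(p, n, c)
        if p >= pi0:
            out["A"] += weight * q
            out["F-"] += weight * (1 - q)
        else:
            out["F+"] += weight * q
            out["B"] += weight * (1 - q)
    return out


def optimal_cutoff(a, b, n, pi0, loss_ratio=1.0):
    """Cutoff in 0..n+1 minimizing loss_ratio * P(F+) + P(F-)."""
    def loss(c):
        pr = classification_probs(a, b, n, c, pi0)
        return loss_ratio * pr["F+"] + pr["F-"]
    return min(range(n + 2), key=loss)


probs = classification_probs(8.0, 2.0, 10, 8, 0.8)   # cutoff at 80% correct
c_opt = optimal_cutoff(8.0, 2.0, 10, 0.8)            # loss ratio of 1
print(probs, c_opt)
```

Raising `loss_ratio` penalizes false advancement more heavily and can only move the selected cutoff upward or leave it unchanged, which is one way to explore the sensitivity to the loss ratio discussed in this section.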

The optimum cutoffs c_o for Mastery Validation Exams are smaller than
or equal to the actually used observed cutoffs c in almost all cases
in the PLATO AFB CBE project. Therefore the probabilities of false
negative associated with c_o are smaller than or equal to those of
false negative associated with the observed cutoff c. But the
probabilities of false positive associated with c_o tend to be larger
than those associated with c. Since we set the loss ratio to 1 in this
case, the total probability of misclassification is always minimized
by using the optimum cutoff. [Illegible] in some tests is eight times
as large as P(F+), while in others the former is larger [illegible].

Setting the most appropriate loss ratio will be a problem when Huynh's
method to evaluate the optimum cutoff is adopted. Also, his method is
more sensitive for the smaller loss ratios than for larger ones, say
Q = 10. Our data showed that Mastery Validation Examinations of the
end-of-lesson tests had the same optimal cutoff c_o for loss ratios
between 8 and 20. If his intention was to control the false positive
errors in the decision of mastery or non-mastery for a linearly
related curriculum such as mathematics, then the applicability of the
method in educational settings will be a problem.
APPENDIX

Table A

Correlations of Aptitude Scores with MVE Scores, First Time, Mastery
Time, and Test Completion Time

Lessons: 103, 104a, 104b, 105; 201b, 202a, 202b, 204, 205a, 205b, 206a,
206b, 206c, 207; 303, 304, 305, 307, 308; 401, 402, 403, 404, 405b,
405c

Columns: MVE scores, First completion, Mastery time

* p < .05

.45 ••• V.39 
•17 -;40 
time data was lost 
.31 

* 

.52 
.16 - 



.38 

*. 

.34 ■ 
.19 

* 

.39 

.47 V 

;42 

.27 

.24 

.24 , 

.24 

.16 ' 

* 

.60 
.17 

* 

.52 

.20 
• * 

* 

.47 

* 

.48 
.10 
.27 
.05 
.31 



-.42 
-.12 
-.19 
-.16 

■ . * 

- .-38 
-.00 
-.03 
-.25 
.02 
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* 

-.08v 

4 - 

1* 

-.38 




-.42. 

-.25 s.;.'. 

-.19- 

-.22 
-.45 • 
-.27 
-.14 . 

-.02'-* . 



Test •> 

completion 
time 

- -* ■ ■•■ 
-.32 

'-.06 



-.32 

* 

-.32 
-.33 
-.10 
-.42 
-.26 
-;-32 

-.20 



-.22 



-.40 



-.23 


-.26 


-.15 






* 


-.03. 


-.13 


'-•34 


* 

-.39 


-.26 


-.19 






* 


-.14 


-.36 


-.51 




* 


* 


-.35 


-.36 


-.45 


* 

-;54 


■•,.„--*-. 59 


* 

-.57 






* 


-^00. 


"' -.03 


-.54 




- * V- 


■ 


-.41 . 


-.41 


.'■■^'^ ^.39 






■ * 


-..27 • '. 


-.39 


■:„.■..^.^ -.39 




* 




-.24 


-.31 


, .09 






, ■ - .* 


-. 27 


-.27 


-.32 


-; 15 


-.27 


', -- . 

.12 


-.03 


-.19 


-.05 


-.11 


-.06 


;02 



TABLE B

Description of Contents in the Lessons of Chanute

Lesson numbers legible in the source: 104a, 104b, 105, 202b, 203a, 203c

Contents, in the order in which they appear in the source:
  Principles of Gas Engine
  Identification of Parts and Purpose of Gasoline Engine Compressor
  Cooling System
  Air and Exhaust System
  Principles of Electricity
  Batteries
  Electrical Schematics
  [illegible]ing Motors, DC
  Charging System
  Battery Ignition
  Emission Control
  Diesel Engines
  Lighting System
  Warning System
  Clutches
  Basic Hydraulics
  Fluid Couplings/Torque [illegible]
  U-Joints/Propeller Shafts
  Differentials
  Transfer Case/PTO
  Suspension System
  Hydraulic and Mechanical Brakes
  Air Brakes
  Power Assisted Brakes